A Dataflow-Oriented Atomicity and Provenance System for Pipelined Scientific Workflows

Authors

  • Liqiang Wang
  • Shiyong Lu
  • Xubo Fei
  • Jeffrey L. Ram
Abstract

Scientific workflows have gained great momentum in recent years due to their critical roles in e-Science and cyberinfrastructure applications. However, some tasks of a scientific workflow might fail during execution. A domain scientist might require a region of a scientific workflow to be “atomic”. Data provenance, which determines the source data that are used to produce a data item, is also essential to scientific workflows. In this paper, we propose: (i) an architecture for scientific workflow management systems that supports both provenance and atomicity; (ii) a dataflow-oriented atomicity model that supports the notions of commit and abort; and (iii) a dataflow-oriented provenance model that, in addition to supporting existing provenance graphs and queries, also supports queries related to atomicity and failure.
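
The abstract does not give the paper's formal definitions, so the following is only a minimal sketch of how commit/abort semantics and a provenance query might fit together for one atomic region. It assumes that tasks in the region run sequentially and that outputs are staged until the whole region succeeds. The names `AtomicRegion`, `run`, and `lineage` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class AtomicRegion:
    """Hypothetical atomic region: outputs become visible only on commit."""
    name: str
    staged: dict = field(default_factory=dict)      # outputs not yet visible
    committed: dict = field(default_factory=dict)   # outputs visible downstream
    provenance: list = field(default_factory=list)  # (output, inputs, task, status)

    def run(self, tasks, inputs):
        """Run tasks in order; commit every output, or abort and discard all."""
        data = dict(inputs)
        try:
            for task_name, func, in_keys, out_key in tasks:
                result = func(*(data[k] for k in in_keys))
                data[out_key] = result
                self.staged[out_key] = result
                self.provenance.append((out_key, tuple(in_keys), task_name, "staged"))
        except Exception:
            self._finish("aborted")
            return False
        self.committed.update(self.staged)
        self._finish("committed")
        return True

    def _finish(self, status):
        """Clear the staging area and relabel staged provenance records."""
        self.staged.clear()
        self.provenance = [
            (out, ins, task, status if old == "staged" else old)
            for out, ins, task, old in self.provenance
        ]

    def lineage(self, item):
        """Return the source data items that a committed item was derived from."""
        producers = {
            out: ins for out, ins, _task, status in self.provenance
            if status == "committed"
        }
        if item not in producers:
            return {item}  # no producing task recorded: treat as source data
        sources = set()
        for parent in producers[item]:
            sources |= self.lineage(parent)
        return sources


# Usage: a two-task pipeline that commits, then one that fails and aborts.
ok_region = AtomicRegion("normalize")
ok = ok_region.run(
    [("scale", lambda x: x * 2, ["raw"], "scaled"),
     ("shift", lambda x: x + 1, ["scaled"], "final")],
    {"raw": 5},
)

bad_region = AtomicRegion("normalize")
bad = bad_region.run(
    [("scale", lambda x: x * 2, ["raw"], "scaled"),
     ("fail", lambda x: x / 0, ["scaled"], "final")],
    {"raw": 5},
)
```

In this sketch the aborted region exposes no outputs, yet its provenance records remain and are marked as aborted, which is one way failure-related queries of the kind the abstract mentions could be answered.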

Similar resources

Atomicity and provenance support for pipelined scientific workflows

Today many significant scientific discoveries are achieved through complex and distributed scientific computations that are structured and represented as scientific workflows. Although atomicity is a well studied topic in transaction processing and business workflows, such an important capability needs to be revisited in a scientific workflow environment. Firstly, the semantics of atomicity nee...

A Model for User-Oriented Data Provenance in Pipelined Scientific Workflows

Integrated provenance support promises to be a chief advantage of scientific workflow systems over script-based alternatives. While it is often recognized that information gathered during scientific workflow execution can be used automatically to increase fault tolerance (via checkpointing) and to optimize performance (by reusing intermediate data products in future runs), it is perhaps more si...

Actor-Oriented Design of Scientific Workflows

Scientific workflows are becoming increasingly important as a unifying mechanism for interlinking scientific data management, analysis, simulation, and visualization tasks. Scientific workflow systems are problem-solving environments, supporting scientists in the creation and execution of scientific workflows. While current systems permit the creation of executable workflows, conceptual modelin...

NiW: Converting Notebooks into Workflows to Capture Dataflow and Provenance

Interactive notebooks are increasingly popular among scientists to expose computational methods and share their results. However, it is often challenging to track their dataflow, and therefore the provenance of their results. This paper presents an approach to convert notebooks into scientific workflows that capture explicitly the dataflow across software components and facilitate tracking prov...

Context-aware scientific workflow systems using KEPLER

Data-intensive scientific workflows are often modelled using a dataflow-oriented model. The simplicity of a dataflow model facilitates intuitive workflow design, analysis and optimisation. However, some amount of control flow modelling is often necessary for engineering fault-tolerant, robust and adaptive workflows. In the scientific domain, a myriad of environment information is needed for control...

Publication date: 2007